feat(github): Add cursor-based pagination to integration repos endpoint by jaydgoss · Pull Request #112591 · getsentry/sentry

jaydgoss · 2026-04-09T17:13:44Z

Summary

Add opt-in cursor-based pagination to GET /organizations/{org}/integrations/{id}/repos/, triggered when a caller sends per_page without search
Add get_repositories_paginated() to BaseRepositoryIntegration (returns None by default) with a GitHub override that slices the Django-cached repo list
Existing consumers that don't send per_page continue to receive the full repo list unchanged

This supports the SCM onboarding repo selector which needs fast page-at-a-time loading for orgs with many GitHub repos.

Stacks on #112327 (VDY-68 caching).

Test plan

Existing OrganizationIntegrationReposTest tests pass (no behavior change for current consumers)
New OrganizationIntegrationReposPaginatedTest tests verify:
- First/second/last page pagination with cursor + Link headers
- Archived repos excluded
- installableOnly filter works with pagination
- Single-page results have no Link header
- Requests without per_page use the full list (existing behavior)
- Requests with search + per_page use the full list (search bypasses pagination)

Refs VDY-46

linear-code · 2026-04-09T17:13:48Z

VDY-46 SCM repo selector: pre-populate dropdown with initial repos and support pagination

src/sentry/integrations/api/endpoints/organization_integration_repos.py

wedamija · 2026-04-09T23:39:32Z

src/sentry/integrations/api/endpoints/organization_integration_repos.py

+
+            return response
+
        return self.respond({"detail": "Repositories not supported"}, status=400)


It might be worth seeing if there's a way to work this into using our generic self.paginate interface. I don't know if it will work well for an api call like this, but could be worth a quick try to see

wedamija · 2026-04-09T23:40:31Z

src/sentry/integrations/api/endpoints/organization_integration_repos.py

+                    per_page = max(1, min(int(request.GET["per_page"]), 100))
+                except (ValueError, TypeError):
+                    per_page = 100
+                cursor = self._parse_cursor(request)


I feel like we should already have api methods that handle producing our cursors, even if you're unable to use the built in self.paginate method it might be worth looking into those.

Switched to self.get_per_page() and self.get_cursor_from_request() in the latest push bc5a262.

The full self.paginate() flow doesn't seem to entirely work here since it expects a Django Paginator with a queryset, and we're paginating over an external API with a custom response shape ({"repos": [...], "searchable": bool}).

So the CursorResult construction for Link headers is still manual, but the input parsing now uses the standard helpers.

wedamija · 2026-04-09T23:47:28Z

src/sentry/integrations/github/integration.py

+        This is acceptable for infinite-scroll consumers.
+        """
+        client = self.get_client()
+        all_repos = client.get_repos_cached()


If we're just fetching all of these anyway, is there all that much value in paginating on the backend? I know we save on serialization + network cost, but we're still pulling all of these items into memory

Good call, forgot that /installation/repositories already supports page + per_page params and returns total_count (our existing interactions with that endpoint just aggregate all pages).

Going to rework this to just make a single GH API call per page instead of caching the full list and slicing.

Made that change in bc5a262

starting to think this pagination is not worth the trouble, the filtering we are applying post page fetch (installable_only, not "archived") is a problem.

only thing I can think that would solve that would be to have get_repositories_paginated fetch pages and filter until it assembles a full page of filtered results.

wedamija · 2026-04-09T23:48:53Z

src/sentry/integrations/source_code_management/repository.py

+        implement pagination (callers should fall back to
+        ``get_repositories()``).
+        """
+        return None


Probably raise NotImplementedError here?

Good point. The call site is already in a try/except anyway, so NotImplementedError would just be another clause. Will switch to that.

Made that change in bc5a262

Add a new /repos-paginated/ endpoint that paginates over the cached accessible repo list for GitHub. Other providers fall back to returning the full list from get_repositories(). This is purpose-built for the SCM onboarding repo selector dropdown, where we need fast initial page loads via cursor-based pagination. Search is intentionally excluded -- the FE uses the existing /repos/ endpoint with accessibleOnly for that. The pagination interface is provider-agnostic: - BaseRepositoryIntegration.get_repositories_paginated() returns None by default (unsupported) - GitHubIntegration overrides it to slice the Django-cached repo list - The endpoint dispatches based on the return value Refs VDY-46

This reverts commit fe092f3.

Add opt-in pagination to the existing /repos/ endpoint, triggered when a caller sends the per_page query param without a search query. Existing consumers that don't send per_page continue to receive the full repo list unchanged. The pagination interface is provider-agnostic: - BaseRepositoryIntegration.get_repositories_paginated() returns None by default (provider does not support pagination) - GitHubIntegration overrides it to slice the Django-cached repo list - The endpoint dispatches based on the return value, falling back to get_repositories() when pagination is not supported This supports the SCM onboarding repo selector, which needs fast page-at-a-time loading for orgs with many GitHub repos. Refs VDY-46

Promote the local to_repo_info closure to a method on GitHubIntegration so both get_repositories and get_repositories_paginated share the same formatting logic. Refs VDY-46

Clamp per_page to at least 1 to prevent infinite empty pages (per_page=0) or incorrect has_next (per_page=-1). Add docstring to get_repositories_paginated noting it always serves from the cached accessible-repos list. Refs VDY-46

Wrap get_repositories_paginated in try/except for IntegrationError and IdentityNotValid so token revocations return 400 instead of 500. Validate per_page as an integer (non-numeric input like ?per_page=abc defaults to 100 instead of raising ValueError). Clamp cursor offset to >= 0 so crafted cursors like 0:-5:0 cannot produce unexpected slicing behavior. Refs VDY-46

Add tests for input validation edge cases: per_page=0, per_page=-1, per_page=abc, per_page=200, negative cursor offset, and IntegrationError in the paginated path. Document that accessibleOnly is ignored in the paginated path and that CursorResult is only used for Link header generation. Remove redundant return_value=[] from test patch decorators. Refs VDY-46

Document cache consistency caveat in get_repositories_paginated docstring: cache expiry between pages can cause duplicates/skips, acceptable for infinite-scroll. Make _parse_cursor a staticmethod. Add comment clarifying per_page and offset are guaranteed set when paginated is not None. Refs VDY-46

Fix references to the old get_accessible_repos_cached name in get_repositories_paginated and pagination tests after rebase.

…icing Paginated repo browsing now makes a single GitHub API call per page via GET /installation/repositories?page=N&per_page=M instead of fetching all repos into a cache and slicing. This avoids loading the full repo list into memory on every request. Also switches to the built-in get_per_page() and get_cursor_from_request() helpers, raises NotImplementedError in the base class instead of returning None, and removes the custom _parse_cursor method. Refs VDY-46

Document that archived repos and installableOnly filtering can cause pages to have fewer items than per_page and has_next to overestimate. Add test for NotImplementedError fallback when a provider does not support paginated browsing. Refs VDY-46

github-actions · 2026-04-10T16:17:12Z

Backend Test Failures

Failures on ec4dd74 in this run:

tests/sentry/integrations/api/endpoints/test_organization_integration_repos.py::OrganizationIntegrationReposPaginatedTest::test_not_implemented_falls_back_to_full_list

— log

[gw0] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/api/endpoints/test_organization_integration_repos.py:546: in test_not_implemented_falls_back_to_full_list
    assert response.status_code == 200, response.content
E   AssertionError: b'{"detail":"Internal Error","errorId":null}'
E   assert 500 == 200
E    +  where 500 = <Response status_code=500, "application/json">.status_code

tests/sentry/integrations/api/endpoints/test_organization_integration_repos.py::OrganizationIntegrationReposPaginatedTest::test_without_per_page_uses_full_list

— log

[gw0] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/api/endpoints/test_organization_integration_repos.py:481: in test_without_per_page_uses_full_list
    assert response.status_code == 200, response.content
E   AssertionError: b'{"detail":"Internal Error","errorId":null}'
E   assert 500 == 200
E    +  where 500 = <Response status_code=500, "application/json">.status_code

tests/sentry/integrations/api/endpoints/test_organization_integration_repos.py::OrganizationIntegrationReposPaginatedTest::test_search_with_per_page_uses_full_list

— log

[gw1] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/integrations/api/endpoints/test_organization_integration_repos.py:497: in test_search_with_per_page_uses_full_list
    assert response.status_code == 200, response.content
E   AssertionError: b'{"detail":"Internal Error","errorId":null}'
E   assert 500 == 200
E    +  where 500 = <Response status_code=500, "application/json">.status_code

github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Apr 9, 2026

vercel bot deployed to Preview April 9, 2026 17:15 View deployment

vercel bot deployed to Preview April 9, 2026 17:24 View deployment

vercel bot deployed to Preview April 9, 2026 17:30 View deployment

vercel bot deployed to Preview April 9, 2026 17:33 View deployment

jaydgoss marked this pull request as ready for review April 9, 2026 17:36

jaydgoss requested review from a team as code owners April 9, 2026 17:36

jaydgoss requested a review from a team April 9, 2026 17:36

vercel bot deployed to Preview April 9, 2026 17:37 View deployment

jaydgoss mentioned this pull request Apr 9, 2026

perf(github): Cache accessible repos for accessibleOnly search #112548

Merged

3 tasks

jaydgoss force-pushed the jaygoss/vdy-46-scm-repo-selector-pre-populate-dropdown-with-initial-repos-BE branch from 0bce46d to 41c5d02 Compare April 9, 2026 21:26

vercel bot deployed to Preview April 9, 2026 21:28 View deployment

sentry bot reviewed Apr 9, 2026

View reviewed changes

src/sentry/integrations/api/endpoints/organization_integration_repos.py Show resolved Hide resolved

wedamija reviewed Apr 9, 2026

View reviewed changes

jaydgoss marked this pull request as draft April 10, 2026 02:29

vercel bot deployed to Preview April 10, 2026 02:38 View deployment

jaydgoss mentioned this pull request Apr 10, 2026

feat(onboarding): Add paginated browse + search to repo selector #112644

Draft

5 tasks

vercel bot deployed to Preview April 10, 2026 02:56 View deployment

jaydgoss marked this pull request as ready for review April 10, 2026 03:06

Base automatically changed from jaygoss/vdy-68-perfgithub-optimize-accessibleonly-repo-search-to-avoid to master April 10, 2026 16:08

jaydgoss added 7 commits April 10, 2026 11:09

Revert "feat(github): Add paginated repos endpoint for SCM onboarding"

c62b01b

This reverts commit fe092f3.

ref(github): Extract _to_repo_info method for reuse

4b400c7

Promote the local to_repo_info closure to a method on GitHubIntegration so both get_repositories and get_repositories_paginated share the same formatting logic. Refs VDY-46

docs(github): Note installableOnly filtering after pagination slice

642a481

jaydgoss added 6 commits April 10, 2026 11:09

fix(github): Add type annotation for cursor_result

4760bef

ref(github): Update paginated path for get_repos_cached rename

8103616

Fix references to the old get_accessible_repos_cached name in get_repositories_paginated and pagination tests after rebase.

jaydgoss force-pushed the jaygoss/vdy-46-scm-repo-selector-pre-populate-dropdown-with-initial-repos-BE branch from e5fc935 to 000570b Compare April 10, 2026 16:09

vercel bot deployed to Preview April 10, 2026 16:11 View deployment

jaydgoss marked this pull request as draft April 10, 2026 16:55


		return response

		return self.respond({"detail": "Repositories not supported"}, status=400)

Uh oh!

Conversation

jaydgoss commented Apr 9, 2026

Summary

Test plan

Uh oh!

linear-code bot commented Apr 9, 2026

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jaydgoss Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Backend Test Failures

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jaydgoss Apr 10, 2026 •

edited

Loading

github-actions bot commented Apr 10, 2026 •

edited

Loading